BUT-TYPED: Using domain knowledge for computing typed similarity
Abstract
This paper deals with knowledge-based text processing that aims at an intuitive notion of textual similarity. Entities and relations relevant to a particular domain are identified and disambiguated by means of semi-supervised machine learning techniques, and the resulting annotations are applied to compute the typed similarity of individual texts. The work described in this paper shows, in particular, the effects of these processes in the context of the *SEM 2013 pilot task on typed similarity, a part of the Semantic Textual Similarity shared task. The goal is to evaluate the degree of semantic similarity between semi-structured records. As the evaluation dataset was taken from Europeana, a collection of records on European cultural heritage objects, we focus on computing a semantic distance on the author field, which has the highest potential to benefit from domain knowledge. The specific features employed in our system, BUT-TYPED, are briefly introduced together with a discussion of their efficient acquisition. Support Vector Regression is then used to combine the features and to provide a final similarity score. The system ranked third on the author attribute among 15 submitted runs in the typed-similarity task.
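As a rough illustration of what per-field similarity features for an author attribute can look like, the sketch below computes two simple surface features for a pair of author strings. These features and the names `normalize` and `author_features` are hypothetical stand-ins chosen for illustration; they are not the domain-knowledge features used in BUT-TYPED. In a system like the one described, a regressor such as Support Vector Regression would be trained to map feature vectors of this kind to a similarity score.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase an author string and split it into alphanumeric tokens."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in name.lower())
    return cleaned.split()


def author_features(a, b):
    """Return two illustrative similarity features for a pair of author fields.

    Both features are assumptions made for this sketch, not the paper's features.
    """
    tokens_a, tokens_b = set(normalize(a)), set(normalize(b))
    union = tokens_a | tokens_b
    # Feature 1: Jaccard overlap of the name tokens (0.0 when both are empty).
    jaccard = len(tokens_a & tokens_b) / len(union) if union else 0.0
    # Feature 2: character-level ratio on sorted tokens, so that
    # "Surname, First" and "First Surname" are compared order-independently.
    ratio = SequenceMatcher(
        None, " ".join(sorted(tokens_a)), " ".join(sorted(tokens_b))
    ).ratio()
    return [jaccard, ratio]


if __name__ == "__main__":
    print(author_features("Rembrandt van Rijn", "Rijn, Rembrandt van"))
    print(author_features("Monet", "Picasso"))
```

Surface features like these cannot tell that two differently spelled names refer to the same person, which is the kind of gap that entity disambiguation with domain knowledge is meant to close.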
Similar resources
The Comparison of Typed and Handwritten Essays of Iranian EFL Students in terms of Length, Spelling, and Grammar
This study attempted to compare typed and handwritten essays of Iranian EFL students in terms of length, spelling, and grammar. To administer the study, the researchers utilized Alice Touch Typing Tutor software to select 15 upper intermediate students with higher ability to write two essays: one typed and the other handwritten. The students were both males and females between the ages of 22 to...
The Use of the Typed Lambda Calculus for Guiding Naive Users in the Representation and Acquisition of Part-Whole Knowledge
We address the task of enabling naive users in a practical context to define, comprehend and use knowledge bases for representing part-whole information. This work is part of a larger effort whose target users were ecologists who had little experience in mathematics, computing, and artificial intelligence, but who wished to build computer simulation models of ecological systems. The ecological ...
UBC_UOS-TYPED: Regression for typed-similarity
We approach the typed-similarity task using a range of heuristics that rely on information from the appropriate metadata fields for each type of similarity. In addition we train a linear regressor for each type of similarity. The results indicate that the linear regression is key for good performance. Our best system was ranked third in the task.
PolyUCOMP-CORE_TYPED: Computing Semantic Textual Similarity using Overlapped Senses
The Semantic Textual Similarity (STS) task aims to examine the degree of semantic equivalence between sentences (Agirre et al., 2012). This paper presents the work of the Hong Kong Polytechnic University (PolyUCOMP) team, which participated in the STS core and typed tasks of SemEval2013. For the STS core task, the PolyUCOMP system disambiguates word senses using contexts and then determines sen...
Parsing in Dialogue Systems Using Typed Feature Structures (Memoranda Informatica 95525)
The analysis of natural language in the context of keyboard-driven dialogue systems is the central issue addressed in this paper. A module that corrects typing errors and performs domain-specific morphological analysis has been developed. A parser for typed unification grammars is designed and implemented in C++; for description of the lexicon and the grammar a specialised specification language h...